Shallow parsing of Hungarian business news
Abstract
This paper reports on an attempt to annotate noun phrases in Hungarian using cascaded regular grammars developed with the CLaRK system. Hungarian poses several difficulties for shallow parsing, including discourse-oriented constituent order and left-branching recursive possessive and participial structures inside noun phrases. The NP grammar was tested on a morphologically tagged and disambiguated corpus of 928 sentences sampled from a highly sophisticated written journalistic style. The results are encouraging even on this challenging text type.
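To make the idea of a cascaded regular grammar concrete, here is a minimal Python sketch. It is not the paper's NP grammar and does not use the CLaRK system; the tag names (`DET`, `ADJ`, `N`, `NPOSS`, `V`), the two rules, and the helper names `apply_stage` and `chunk` are all hypothetical and chosen only for illustration. Each stage is a regular expression over the sequence of labels produced by the previous stage, so a later stage can wrap an earlier NP inside a larger, left-branching possessive NP.

```python
import re

# Hypothetical two-stage cascade (illustrative only, not the paper's grammar).
# Stage 1 builds base NPs; stage 2 attaches a possessed noun to a possessor NP.
CASCADE = [
    ("NP", r"(?:DET )?(?:ADJ )*N "),
    ("NP", r"NP NPOSS "),
]

def apply_stage(nodes, label, pattern):
    """Replace every non-overlapping match of `pattern` with one `label` node.

    `nodes` is a list of (label, payload) pairs. The labels are joined into a
    string such as "DET N NPOSS V " so that an ordinary regex can scan them.
    """
    text = "".join(lab + " " for lab, _ in nodes)
    # The lookbehind forces matches to start at a label boundary.
    regex = re.compile(r"(?<!\S)(?:" + pattern + ")")
    out, pos = [], 0
    for m in regex.finditer(text):
        start = text[:m.start()].count(" ")   # index of first matched node
        end = start + m.group().count(" ")    # one past the last matched node
        out.extend(nodes[pos:start])
        out.append((label, nodes[start:end]))
        pos = end
    out.extend(nodes[pos:])
    return out

def chunk(tagged):
    """Run all stages in order over a list of (tag, word) pairs."""
    nodes = list(tagged)
    for label, pattern in CASCADE:
        nodes = apply_stage(nodes, label, pattern)
    return nodes

# "a fiú könyve elveszett" ("the boy's book got lost"), with toy tags.
sentence = [("DET", "a"), ("N", "fiú"), ("NPOSS", "könyve"), ("V", "elveszett")]
result = chunk(sentence)
```

In this toy run, stage 1 groups `a fiú` into a base NP, and stage 2 then wraps that NP together with the possessed noun `könyve` into a larger NP. Deeper possessive recursion would need either more stages or repeated application of stage 2, which this sketch does not attempt.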
Similar resources
Assignment problem and its application in Nigerian institutions: Hungarian method approach
The assignment model is a powerful operations research technique that can be used to solve assignment or allocation problems. This study applies the assignment model to the course allocation problem in Nigerian tertiary institutions in order to maximize lecturers’ effectiveness. A well-structured questionnaire was used to obtain data from lecturers, and the problem was solved with the Hungarian method. The study revealed...
Is It Possible to Export Swedish Publishing Systems to Hungary? Assessing business opportunities in Hungary for Swedish information systems supporting news publishing
This report sets out to assess business opportunities for Swedish IT companies that sell information systems to news publishers. The report consists of three major parts. First, it maps out the Swedish companies in this field of business to then categorise their offers. Next, it describes the Hungarian market and some of its largest players. Finally, using the categorisation obtained from the f...
A Unification-based Approach to Morpho-syntactic Parsing of Agglutinative and Other (Highly) Inflectional Languages
This paper introduces a new approach to morpho-syntactic analysis through Humor 99 (High-speed Unification Morphology), a reversible and unification-based morphological analyzer which has already been integrated with a variety of industrial applications. Humor 99 successfully copes with problems of agglutinative (e.g. Hungarian, Turkish, Estonian) and other (highly) inflectional languages (e.g...
The Szeged Treebank Project
The major aim of the Szeged Treebank project was to create a high-quality database of syntactic structures for Hungarian that can serve as a golden standard to further research in linguistics and computational language processing. The treebank currently contains full syntactic parsing of about 82,000 sentences (1.2 million words), which is the result of accurate manual annotation. Inspired by t...
Accent assignment algorithm in Hungarian, based on syntactic analysis
This article presents the results of the research aimed at developing an accent assignment system for Hungarian. Two methods are compared. The shallow method targets local and short-distance factors that determine accent; the deep (syntactic) method targets long-distance influences (such as focus). Neither of the methods alone results in absolutely satisfactory output; frequently, however, mist...